PATENT

Attorney Docket No.: 015358-009420US 
Client Reference No.: ID-RII-317

TECHNIQUES FOR PERFORMING OPERATIONS ON A SOURCE SYMBOLIC DOCUMENT

CROSS-REFERENCES TO RELATED APPLICATIONS 
[0001] The present application is a non-provisional of and claims priority from U.S.
Provisional Application No. 60/462,412, Attorney Docket No. 15358-009400US, filed April 
11, 2003, the entire contents of which are herein incorporated by reference for all purposes. 

[0002] The present application incorporates by reference for all purposes the entire contents 
of the following: 

[0003] U.S. Application No. 10/412,757, Attorney Docket No. 15358-009500US, filed
April 11, 2003;

[0004] U.S. Application No. _/__, , Attorney Docket No. 15358-009410US, filed
concurrently with the present application;

[0005] U.S. Application No. _/__, , Attorney Docket No. 15358-009430US, filed
concurrently with the present application; and

[0006] U.S. Application No. 10/001,895, Attorney Docket No. 15358-006500US, filed 
November 19, 2001. 

BACKGROUND OF THE INVENTION 
[0007] The present invention generally relates to techniques for determining actions to
perform and more specifically to techniques for using recorded information and portions of a 
source document to determine actions to perform. 

[0008] Presentations are a powerful tool for presenting information to participants. During 
the presentation, slides from a source document are outputted and displayed while a presenter 
may describe or provide explanation for the outputted slides. At certain points during the
presentation, certain actions may be desired. For example, a participant may want to view a
translated slide of the outputted slide. Conventionally, in order to view the translated slide, a
participant must recognize that a certain slide has been outputted and then manually initiate a
translation program to translate the outputted slide or manually retrieve a pre-translated slide
that corresponds to the outputted slide. 

[0009] In addition to the above action, other actions may be desired, such as notifying 
participants that a presentation has started when a first slide in a source document is outputted 
and displayed. Also, different actions may be desired when different slides are outputted and
displayed. For example, a video may be played or an e-mail may be sent when a certain slide 
is outputted. Conventionally, these actions are performed manually. 

[0010] Accordingly, there is a need for automated techniques for determining actions to 
perform based on recorded information of an outputted slide from a source document. 

BRIEF SUMMARY OF THE INVENTION

[0011] Embodiments of the present invention generally relate to determining actions to 
perform. Embodiments of the present invention access recorded information. A source 
document is then determined using the recorded information. If a criterion is satisfied based 
on the recorded information and the source document, an action to be performed is
determined. The action is then performed if it is determined that the criterion is satisfied.

[0012] In one embodiment, techniques for performing an action are provided. The techniques
include: accessing recorded information; determining a source document using the recorded
information; determining if a criterion is satisfied based on the recorded information and the
source document; determining an action to be performed if the criterion is satisfied; and
performing the action if it is determined that the criterion is satisfied.

[0013] In another embodiment, techniques for performing an action are provided. The 
techniques include: accessing a first piece of information in recorded information, the first 
piece of information including information in a source document; comparing the first piece of 
information to information in the source document to determine information in the source 
document that matches the first piece of information; determining if a criterion is satisfied
based on the first piece of information and matched information in the source document; 
determining an action to be performed if the criterion is satisfied; and performing the action 
if it is determined that the criterion is satisfied. 

[0014] In yet another embodiment, techniques for determining translated slides of source
document slides in a source document are provided. The techniques include: accessing
recorded information; determining a source document slide in the source document using the
recorded information; determining a translated slide of the source document slide; and
communicating the translated slide to a device.

[0015] A further understanding of the nature and advantages of the inventions herein may
be realized by reference to the remaining portions of the specification and the attached
drawings.

BRIEF DESCRIPTION OF THE DRAWINGS 
[0016] Fig. 1 depicts a simplified block diagram of a system for capturing information used 
to determine actions to perform according to one embodiment of the present invention; 

[0017] Fig. 2 is a simplified block diagram of a data processing system that may 
incorporate an embodiment of the present invention;

[0018] Fig. 3 depicts a simplified flow chart of a method for using recorded information 
and portions of a source document to determine actions to perform according to one 
embodiment of the present invention; 

[0019] Fig. 4 illustrates a simplified block diagram of a system for performing actions 
using recorded information and information from a source document according to one
embodiment of the present invention; and 

[0020] Fig. 5 illustrates a system that depicts the automatic translation of slides being 
presented according to one embodiment of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION 
[0021] Fig. 1 depicts a simplified block diagram 100 of a system for capturing information
used to determine actions to be performed according to one embodiment of the present
invention. It will be apparent that system 100 as depicted in Fig. 1 is merely illustrative of an
embodiment incorporating the present invention and does not limit the scope of the invention
as recited in the claims. One of ordinary skill in the art will recognize other variations,
modifications, and alternatives.

[0022] A presentation driver device 102 and a display device 104 are used to output slides
and other information that may be stored in a source document 108 or a symbolic
presentation file. For example, slides from a PowerPoint™ (PPT) presentation may be output
and displayed on display device 104. In one embodiment, the term "source document" as
used in this application is intended to refer to any document stored in electronic form. For
example, a document that is created using an application program, or a portion of the contents
of such a document, may be a source document. Also, source documents may be scanned
documents, a PDF version of a document, an image of a document, etc. The contents of
source document 108 may include slides, images, text information, etc. A source document
may comprise one or more portions. For example, a PPT document may comprise one or
more pages. Each page of a PPT document may comprise one or more slides. The portions
of source document 108 will be referred to as slides for discussion purposes but it will be
understood that a slide may also be one or more images, one or more pages of a document,
etc. Source document 108 may be created using one or more application programs. For
example, a PPT document may be created using an application program, such as Microsoft's
PowerPoint™. Source document 108 is an electronic document that may be manipulated and
edited using the application program that created it, or any other program.

[0023] In one embodiment, source document 108 is different than a captured image of a
slide, which has not been created by an application program and is often not directly editable
by the application program. For example, a PPT document comprising one or more slides
created using a PowerPoint™ application program can be easily edited by the PowerPoint™
application. In contrast, a Joint Photographic Experts Group (JPEG) image of the displayed
slide is not created by the PowerPoint™ application but is recorded information. Although a
PPT document may contain JPEG images, the JPEG images are included in a slide created by a
PPT application.

[0024] When a slide of source document 108 is displayed on display device 104, it is 
referred to as an outputted slide 106. For example, outputted slide 106 is a slide or image 
from source document 108 that has been outputted and displayed. 

[0025] While a presenter is giving a presentation, the presenter may display slides from
source document 108 on display device 104. While a slide is being displayed on display
device 104, the presenter will then often describe or explain the contents of the displayed
slide. For example, the presenter may embellish on the text or images displayed in
multimedia document 114. Attendees of the presentation may also comment on the displayed
slide (e.g., ask questions about the slide, etc.). The information output during a presentation,
including information output by display device 104, by the presenter, by attendees of the
presentation, or any information captured during the presentation may be captured or
recorded using one or more capture devices 110. Examples of presentations include lectures,
meetings, speeches, conferences, classes, demonstrations, etc.

[0026] Information recorded or captured during a presentation may include text
information, graphics information, animation information, sound (audio) information, video
information, slides information, whiteboard images information, and other types of
information. For example, a video recording of a presentation may comprise video information
and/or audio information. In certain instances the video recording may also comprise
closed-captioned (CC) text information which comprises material related to the video
information, and in many cases, is an exact representation of the speech contained in the audio
portions of the video recording. Recorded information is also used to refer to information
comprising one or more objects wherein the objects include information of different types. For
example, objects included in recorded information may comprise text information, graphics
information, animation information, sound (audio) information, video information, slides
information, whiteboard images information, and other types of information.

[0027] In one embodiment, the recorded information may be stored in a multimedia
document 114 in a database 115. Alternatively, the recorded information may be processed
in real-time as it is captured. The term "multimedia document" as used in this application is
intended to refer to any electronic storage unit (e.g., a file, a directory, etc.) that stores
recorded information. Various different formats may be used to store the recorded
information. These formats include various MPEG formats (e.g., MPEG 1, MPEG 2, MPEG
4, MPEG 7, etc.), MP3 format, SMIL format, HTML+TIME format, WMF (Windows Media
Format), RM (Real Media) format, Quicktime format, Shockwave format, various streaming
media formats, formats being developed by the engineering community, proprietary and
customary formats, and others. Examples of multimedia documents 114 include video
recordings, MPEG files, news broadcast recordings, presentation recordings, recorded
meetings, classroom lecture recordings, broadcast television programs, papers, or the like.

[0028] Capture device 110 is configured to capture information presented at a presentation.
Various different types of information output during a presentation may be captured or
recorded by capture devices 110 including audio information, video information, images of
slides or photos, whiteboard information, text information, and the like. For purposes of this
application, the term "presented" is intended to include displayed, output, spoken, etc. For
purposes of this application, the term "capture device" is intended to refer to any device,
system, apparatus, or application that is configured to capture or record information of one or
more types. Examples of capture devices 110 include microphones, video cameras, cameras
(both digital and analog), scanners, presentation recorders, screen capture devices (e.g., a
whiteboard information capture device), symbolic information capture devices, etc. In
addition to capturing the information, capture devices 110 may also be able to capture
temporal information associated with the captured information.

[0029] A presentation recorder is a device that is able to capture information presented
during a presentation, for example, by tapping into and capturing streams of information from
an information source. For example, if a computer executing a PowerPoint™ application is
used to display slides from a *.ppt file, a presentation recorder may be configured to tap into
the video output of the computer and capture keyframes every time a significant difference is
detected between displayed video keyframes of the slides. The presentation recorder is also
able to capture other types of information such as audio information, video information,
slides information stream, etc. The temporal information associated with the captured
information indicating when the information was output or captured is then used to
synchronize the different types of captured information. Examples of presentation recorders
include a screen capture software application, a PowerPoint™ application that allows
recording of slides and time elapsed for each slide during a presentation, and the presentation
recorders described in U.S. Application No. 09/728,560, filed November 30, 2000 (Attorney
Docket No. 15358-006210US), U.S. Application No. 09/728,453, filed November 30, 2000
(Attorney Docket No. 15358-006220US), and U.S. Application No. 09/521,252, filed March 8,
2000 (Attorney Docket No. 15358-006300US).
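
By way of a non-limiting illustration, the keyframe capture described for the presentation
recorder can be sketched as follows. The sketch assumes that each video frame is available as
a flat sequence of grayscale pixel values paired with a timestamp. The function names, the
mean-absolute-difference measure, and the threshold value are assumptions made for this
example and are not taken from the referenced applications.

```python
from typing import List, Sequence, Tuple

Frame = Sequence[int]  # flattened grayscale pixels, 0-255 (assumed representation)


def mean_abs_diff(a: Frame, b: Frame) -> float:
    """Average per-pixel absolute difference between two equally sized frames."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def capture_keyframes(
    frames: Sequence[Tuple[float, Frame]], threshold: float = 10.0
) -> List[Tuple[float, Frame]]:
    """Keep the first frame, then every frame that differs from the most
    recently kept keyframe by more than `threshold` (a hypothetical value)."""
    keyframes: List[Tuple[float, Frame]] = []
    for timestamp, frame in frames:
        if not keyframes or mean_abs_diff(keyframes[-1][1], frame) > threshold:
            keyframes.append((timestamp, frame))
    return keyframes
```

The timestamp kept with each keyframe plays the role of the temporal information that may
be used to synchronize the keyframes with other captured streams.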

[0030] A symbolic information capture device is able to capture information stored in
symbolic presentation documents that may be output during a presentation. For example, a
symbolic information capture device is able to record slides presented at a presentation as a
sequence of images (e.g., as JPEGs, BMPs, etc.). A symbolic information capture device
may also be configured to extract the text content of the slides. For example, during a
PowerPoint™ slide presentation, a symbolic information capture device may record the slides
by capturing slide transitions (e.g., by capturing keyboard commands) and then extracting the
presentation images based on these transitions. Whiteboard capture devices may include
devices such as a camera appropriately positioned to capture contents of the whiteboard, a
screen, a chart, etc.
[0031] According to embodiments of the present invention, information from the recorded
information is used to determine actions to perform. For example, an image of a slide is 
captured and used to trigger an action. 

[0032] In one embodiment, different criteria and actions associated with the criteria are
associated with portions of a source document 108. In one embodiment, a criterion is
associated with a portion of source document 108 in that the criterion is satisfied when it is
determined that the portion has been output during a presentation. The associated criterion
may be satisfied when recorded information is compared to a portion of source document 108
and matching information is found. Although it is described that a criterion may be satisfied,
it will be understood that multiple criteria may be satisfied when a portion of source
document 108 is determined to match recorded information. The criteria and actions may be
associated with portions of source document 108 using information embedded in source
document 108 or information stored separately from source document 108.

[0033] When a criterion is satisfied, the associated action is then performed. In one
example, the criterion may indicate that an action should be performed when recorded
information is compared to and matches a portion of source document 108. When a slide
from source document 108 is output and displayed, an image of the slide is captured as
recorded information. The recorded information is compared to information in source
document 108 to find a match. When a match is determined, the criterion has been satisfied. A
corresponding action for the criterion is then performed.

[0034] Fig. 2 is a simplified block diagram of a data processing system 200 that may 
incorporate an embodiment of the present invention. As shown in Fig. 2, data processing 
system 200 includes at least one processor 202, which communicates with a number of 
peripheral devices via a bus subsystem 204. These peripheral devices may include a storage 
subsystem 206, comprising a memory subsystem 208 and a file storage subsystem 210, user
interface input devices 212, user interface output devices 214, and a network interface
subsystem 216. The input and output devices allow user interaction with data processing
system 200.

[0035] Network interface subsystem 216 provides an interface to other computer systems,
networks, and storage resources 204. The networks may include the Internet, a local area
network (LAN), a wide area network (WAN), a wireless network, an intranet, a private
network, a public network, a switched network, or any other suitable communication
network. Network interface subsystem 216 serves as an interface for receiving data from
other sources and for transmitting data to other sources from data processing system 200. For
example, data processing system 200 may receive the images to be compared via network
interface subsystem 216. Embodiments of network interface subsystem 216 include an Ethernet
card, a modem (telephone, satellite, cable, ISDN, etc.), (asynchronous) digital subscriber line
(DSL) units, and the like.

[0036] User interface input devices 212 may include a keyboard, pointing devices such as a 
mouse, trackball, touchpad, or graphics tablet, a scanner, a barcode scanner, a touchscreen 
incorporated into the display, audio input devices such as voice recognition systems, 
microphones, and other types of input devices. In general, use of the term "input device" is
intended to include all possible types of devices and ways to input information to data 
processing system 200. 

[0037] User interface output devices 214 may include a display subsystem, a printer, a fax 
machine, or non-visual displays such as audio output devices. The display subsystem may be 
a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), or a
projection device. In general, use of the term "output device" is intended to include all 
possible types of devices and ways to output information from data processing system 200. 

[0038] Storage subsystem 206 may be configured to store the basic programming and data
constructs that provide the functionality of the present invention. For example, according to
an embodiment of the present invention, software modules implementing the functionality of
the present invention may be stored in storage subsystem 206. These software modules may
be executed by processor(s) 202. Storage subsystem 206 may also provide a repository for
storing data used in accordance with the present invention. For example, the images to be
compared including the input image and the set of candidate images may be stored in storage
subsystem 206. Storage subsystem 206 may comprise memory subsystem 208 and file/disk
storage subsystem 210.

[0039] Memory subsystem 208 may include a number of memories including a main 
random access memory (RAM) 218 for storage of instructions and data during program 
execution and a read only memory (ROM) 220 in which fixed instructions are stored. File 
storage subsystem 210 provides persistent (non-volatile) storage for program and data files,
and may include a hard disk drive, a floppy disk drive along with associated removable
media, a Compact Disk Read Only Memory (CD-ROM) drive, an optical drive, removable
media cartridges, and other like storage media. 

[0040] Bus subsystem 204 provides a mechanism for letting the various components and
subsystems of data processing system 200 communicate with each other as intended.
Although bus subsystem 204 is shown schematically as a single bus, alternative embodiments
of the bus subsystem may utilize multiple busses.

[0041] Data processing system 200 can be of varying types including a personal computer, 
a portable computer, a workstation, a network computer, a mainframe, a kiosk, or any other 
data processing system. Due to the ever-changing nature of computers and networks, the 
description of data processing system 200 depicted in Fig. 2 is intended only as a specific
example for purposes of illustrating the preferred embodiment of the computer system. Many
other configurations having more or fewer components than the system depicted in Fig. 2 are 
possible. 

[0042] Fig. 3 depicts a simplified flow chart 300 of a method for using recorded
information and portions of a source document to determine actions to perform according to
one embodiment of the present invention. The method may be performed by software
modules executed by a data processing system, by hardware modules, or combinations
thereof. Flow chart 300 depicted in Fig. 3 is merely illustrative of an embodiment
incorporating the present invention and does not limit the scope of the invention as recited in
the claims. One of ordinary skill in the art would recognize variations, modifications, and
alternatives.

[0043] In step 302, recorded information is accessed. In one embodiment, recorded
information may be captured while a presentation is being given. For example, images of
outputted slides of a source document 108 are captured as the outputted slides are outputted
and displayed. In another embodiment, recorded information may be determined from stored
information in a multimedia document 114. Recorded information may be stored in various
forms and may be stored in a multimedia document 114. Recorded information may
comprise images of slides that were captured during a presentation. The recorded
information can include information of various types such as audio information, video
information, images, keystroke information, etc. A keystroke or voice command may
indicate that a certain slide (e.g., a page or slide number) has been displayed or that the next
slide has been displayed. For example, a keystroke may specify a slide number (e.g., the
number "9" for slide 9) or indicate that a next slide has been displayed. If the keystroke
indicates that a next slide has been displayed, the slide in source document 108 that
corresponds to the next slide is determined. A voice command can be interpreted in the same
way as a keystroke. For example, a voice command may specify a slide number or may
specify that a next slide should be displayed.
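
As a minimal sketch of the keystroke interpretation described above, the following example
maps a captured keystroke either to an explicit slide number or to the slide following the
current one. The key names in NEXT_KEYS, the function name, and the 1-based slide numbering
are assumptions made for illustration only.

```python
from typing import Optional

# Hypothetical key names treated as "advance to the next slide".
NEXT_KEYS = {"PageDown", "Right", "Space", "Enter"}


def resolve_slide(keystroke: str, current_slide: int, slide_count: int) -> Optional[int]:
    """Return the 1-based slide number indicated by a keystroke, or None
    if the keystroke does not identify a slide in the source document."""
    if keystroke.isdecimal():
        target = int(keystroke)        # e.g. "9" selects slide 9
    elif keystroke in NEXT_KEYS:
        target = current_slide + 1     # "next slide" relative to the current one
    else:
        return None
    return target if 1 <= target <= slide_count else None
```

A recognized voice command could be reduced to the same string form (a number or a
next-slide token) and passed through the same function.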

[0044] In step 304, one or more source documents 108 are determined. For discussion
purposes, it will be assumed that one source document 108 is determined but any number of
source documents 108 may be determined. In one embodiment, source document 108 may be
determined using the recorded information accessed in step 302. For example, the recorded
information is used to determine source document 108 from a plurality of source documents
108. Information from the recorded information is compared to information stored in source
documents 108. Based upon the comparison, the recorded information may include
information that matches information in source document 108. If information in the source
document matches information in the recorded information, that source document 108 is
determined. For example, a slide in a source document 108 may be outputted and displayed
and an image of that outputted slide is captured as recorded information. The image in the
recorded information can then be matched to information (e.g., an image) in source document
108. Accordingly, source document 108 with the matched image is determined. In other
embodiments, recorded information may be a keystroke, a voice command, or other
information that may be used to identify a source document 108. For example, keystrokes may
identify the source document being presented: information identifying a source document 108,
such as a filename, may be received as a keystroke.

[0045] In step 306, a portion of source document 108 determined in step 304 is then
determined. In one embodiment, a portion of source document 108 may be a slide that
matches information in the recorded information (e.g., an image of the outputted slide). A
person skilled in the art will appreciate many methods for matching recorded information
with a portion of the source document 108. In one embodiment, the image matching techniques
disclosed in U.S. Patent Application No. 10/412,757, filed on April 11, 2003, entitled
"Automated Techniques For Comparing Contents Of Images", which is hereby incorporated by
reference for all purposes, may be used. For example, an input image (e.g., recorded
information) is matched to a set of potential candidate images. The set of candidate images
may be slides in source document 108. The input images may include keyframes
extracted from video information, images captured during a presentation, etc. For each
portion of recorded information, the images extracted from that portion are used as input
images.
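
The matching of an input against a set of candidate slides across several source documents
can be sketched as below. This is only an illustration under stated assumptions: it compares
text recognized from a captured image against slide text using a generic string-similarity
score from the Python standard library, which is a stand-in and not the image matching
technique of U.S. Patent Application No. 10/412,757. The function name, the dictionary
layout, and the minimum score are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List, Optional, Tuple


def match_slide(
    recorded_text: str,
    documents: Dict[str, List[str]],
    min_score: float = 0.6,
) -> Optional[Tuple[str, int, float]]:
    """Return (document name, 1-based slide number, score) for the best
    matching slide, or None when no slide reaches `min_score`."""
    best: Optional[Tuple[str, int, float]] = None
    query = recorded_text.lower()
    for name, slides in documents.items():
        for number, slide_text in enumerate(slides, start=1):
            score = SequenceMatcher(None, query, slide_text.lower()).ratio()
            if score >= min_score and (best is None or score > best[2]):
                best = (name, number, score)
    return best
```

Returning both the document name and the slide number reflects the two determinations
described above: the source document (step 304) and the portion within it (step 306).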

[0046] In another example, the portion of source document 108 may be determined using
an identifier for the slide in source document 108. A page number from an image may be
recognized in the recorded information. That page number is then used to identify a portion
of source document 108. For example, the slide of source document 108 that corresponds to
the page number is determined. In other embodiments, the page number may be determined
from a keystroke or voice command instead of being recognized in an image.
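
A minimal sketch of identifying a slide from a recognized page number follows. It assumes
that text has already been recognized in the captured image (for example by an OCR step that
is outside the scope of the sketch) and that the number is labeled "page" or "slide"; the
pattern and the function name are assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed labeling convention: "Page 9", "slide 12", etc.
_PAGE_PATTERN = re.compile(r"(?:page|slide)\s*(\d+)", re.IGNORECASE)


def page_number_from_text(recognized_text: str) -> Optional[int]:
    """Return the first labeled page or slide number found in the text, if any."""
    match = _PAGE_PATTERN.search(recognized_text)
    return int(match.group(1)) if match else None
```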

[0047] In step 308, embodiments of the present invention determine if a criterion is
satisfied. A criterion may be dependent on any number of factors. For example, a criterion
may be satisfied if a portion of source document 108 determined in step 304 is compared to
recorded information to determine matching information in the recorded information. For
example, recorded information may comprise an image that matches a slide from source
document 108. Also, the recorded information may include a keystroke indicating a slide
number that matches a slide number in source document 108. When the captured image of
the slide matches a portion of source document 108, the criterion has been satisfied.

[0048] Information specifying the criteria can be stored with source document 108 or 
stored separately. In one embodiment, the criterion is associated with source document 108 
in that the criterion is satisfied when a portion of source document 108 is determined using 
recorded information. Also, other information may be used to determine that a criterion has
been satisfied, such as metadata associated with or extracted from source document 108 or 
the recorded information; other documents that are relevant to source document 108 or the 
recorded information; metadata that identifies a user, source document 108, or recorded 
information; an agenda of items, etc. 

[0049] In step 310, an action associated with the criterion is determined. Examples of
actions include translating a slide in source document 108, retrieving a translated slide
corresponding to an outputted slide from source document 108, sending an e-mail, placing an
automatic telephone call, initiating a streaming video connection, etc. A data structure, such
as a table, that includes different criteria and actions may be associated with source document
108. When the portion of source document 108 is compared to recorded information and
matches information in the recorded information, the table is accessed and searched to
determine a criterion that specifies the matched portion of source document 108. The
action corresponding to that criterion may then be performed.
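
The table of criteria and actions described above can be sketched as a simple lookup keyed
by the matched portion of the source document. The class name, the action labels, and the
choice of a dictionary are assumptions made for this example; the application does not
prescribe a particular data structure beyond "such as a table".

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Criterion:
    """A criterion satisfied when the given slide of the given document is matched."""
    document: str
    slide: int


# Maps each criterion to the labels of the actions associated with it.
ActionTable = Dict[Criterion, List[str]]


def actions_for(table: ActionTable, document: str, slide: int) -> List[str]:
    """Return the actions whose criterion specifies the matched slide."""
    return table.get(Criterion(document, slide), [])
```

For example, a table may associate the first slide of a document with a notification and a
later slide with playing a video and sending an e-mail; a slide with no entry yields no
actions.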

[0050] In step 312, the action is performed. In one embodiment, the action may be 
performed on the determined portion of source document 108. For example, the action is 
translating text in the portion of source document 108. Other actions may also be triggered,
such as sending information to a device. For example, the translated slide may be sent to a 
device, a notification may be sent to a device, etc. 

[0051] Accordingly, actions are determined and triggered based on criteria. The criteria 
may be satisfied when certain events occur. For example, an event may be when a slide from 
source document 108 is outputted and presented and captured as recorded information. When
the recorded information is compared to and matched with information in source document 
108, an action is determined and performed. 

[0052] Fig. 4 illustrates a simplified block diagram of a system 400 for performing actions 
using recorded information and information from a source document 108 according to one 
embodiment of the present invention. System 400 includes an information processor 402, a
criteria determiner 404, and an action performer 406. 

[0053] Information processor 402 is configured to identify, based upon the recorded
information, one or more source documents 108 from a plurality of source documents whose
portions were presented and captured during a presentation. Information processor 402 is
then configured to identify portions of the identified source documents based upon the
recorded information. A database 408 may store any number of source documents 108.
Information processor 402 determines a source document 108 from the stored source
documents 108 and a portion of the determined source document 108 using the recorded
information. Accordingly, information processor 402 performs the functions described in
steps 302, 304, and 306 of Fig. 3.

[0054] Criteria determiner 404 receives recorded information and the determined portion of 
source document 108 from information processor 402 and is configured to determine an 
action to perform. Criteria determiner 404 determines a criterion that is satisfied using the 
recorded information and a portion of source document 108 and performs an action 
30 corresponding to the criterion. As shown, a table 410 may be used to determine if a criterion 
has been satisfied. Table 410 may be stored with the determined source document 108 or be 
stored separately. Table 410 includes one or more criteria and one or more corresponding 

12 



actions for each criterion. The one or more criteria may be that when a slide in source 
document is determined. Thus, a criterion may be satisfied when recorded information is 
compared to source document 108 and it is determined that the recorded information matches 
a slide in source document 108. When the slide is determined, the criterion has been satisfied 
and a corresponding action for the criterion is determined. Accordingly, criteria determiner
404 performs the functions described in steps 308 and 310 of Fig. 3. 
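
The role of table 410 can be sketched as a list of rows, each pairing a criterion with its corresponding actions. The sketch below is only one possible arrangement: the `Row` type, the context keys `matched_slide` and `agenda_item`, and the action names are hypothetical and chosen for illustration.

```python
from typing import Any, Callable, Dict, List, NamedTuple


class Row(NamedTuple):
    criterion: Callable[[Dict[str, Any]], bool]  # returns True when satisfied
    actions: List[str]                           # actions for this criterion


def determine_actions(table: List[Row], context: Dict[str, Any]) -> List[str]:
    """Collect the actions of every row whose criterion is satisfied."""
    actions: List[str] = []
    for row in table:
        if row.criterion(context):
            actions.extend(row.actions)
    return actions


# Hypothetical table contents; the keys and action names are illustrative only.
TABLE = [
    # Criterion: recorded information matched a slide in the source document.
    Row(lambda ctx: ctx.get("matched_slide") is not None,
        ["output_translated_slide"]),
    # Criterion: an agenda item was displayed and captured.
    Row(lambda ctx: ctx.get("agenda_item") is not None,
        ["send_notification"]),
]
```

In this arrangement the determiner only names the actions; carrying them out is left to a separate component, mirroring the split between criteria determiner 404 and action performer 406.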

[0055] Action performer 406 receives the determined action from criteria determiner 404 
and is configured to perform the action. Action performer 406 may communicate with one or 
more devices 412. In one embodiment, an action may include performing an operation and
sending a result of the operation to one or more devices 412. Devices 412 may be personal
computers (PCs), mobile devices (personal digital assistants (PDAs), cellular phones, etc.),
televisions, etc. In one example, action performer 406 may translate a slide in source 
document 108 and output the translated slide to devices 412. Also, action performer 406 may
retrieve a pre-translated slide and output the translated slide to devices 412. A translated
version of the slide being displayed is then displayed on devices 412. In one embodiment, no
software plug-ins need to be installed on devices 412 in order for actions to be performed. 
For example, translated slides of an outputted slide are automatically displayed on devices 
412 when a slide is outputted. In one example, a web browser may be used to view the 
translated slides. 

[0056] The translations may be done before the presentation or in real time during the
presentation. Before a user gives a presentation, the presentation slides may be translated into
one or more different languages either manually or automatically. Alternatively, while a user
is giving a presentation, outputted slides from a source document 108 in a first language (e.g., 
English) are automatically or manually translated to one or more other languages (e.g.,
Japanese) in real-time. In one embodiment, manually translated means that the slide is
translated by a user. 

[0057] Recorded information (e.g., a keystroke, a voice command, or an image of an 
outputted slide) may be used to determine which slide the presenter is presenting. The 
recorded information is used to determine a source document 108 whose slide is being 
presented and included in the recorded information. The slide from source document 108 is
translated or a previously translated slide is retrieved. The translated slide is then outputted
to a user. Accordingly, action performer 406 performs the functions described in step 312 of
Fig. 3. 

[0058] Fig. 5 illustrates a system 500 that depicts the automatic translation of slides being 
presented according to one embodiment of the present invention. As shown, system 500 
includes a display device 104, a capture device 110, and a device 412. Display device 104
displays an outputted slide 106 from a source document 108 in a first language (e.g., 
English). Capture device 110 captures recorded information 502 from display device
104. Alternatively, recorded information 502 may be retrieved from information in a 
multimedia document 114. In one embodiment, recorded information 502 includes an image
(e.g., a Joint Photographic Experts Group (JPEG) image) of a slide displayed on display
device 104. Recorded information 502 may also include a keystroke or voice command 
indicating which slide is being displayed. 

[0059] A slide is outputted from a source document 108. For example, source document 
108 may be a PowerPoint™ (PPT) file 504 that includes slides in a first language (e.g., English).
Slides in PPT file 504 can be translated into another language (e.g., Japanese). The
translation may occur at any time (e.g., before recorded information 502 is captured, while
recorded information 502 is being captured, or after recorded information 502 is captured). If 
slides are translated before the presentation, a translated slide is retrieved from a file (e.g., a 
PPT file 506). If the translation is not done beforehand, a slide in PPT file 504 may be
translated in real-time.

[0060] In one embodiment, PPT file 504 can be associated with information, such as 
metadata (e.g., an e-mail address of the person who submitted PPT file 504, an IP address of a capture
device, a time stamp, a user identifier, an original language, directories where the file is stored, etc.).
The information may be used to determine if a criterion is satisfied or in performing an
action. Slides in PPT file 504 can be translated into many languages and individual PPT files
may be created for each language. For discussion purposes, slides will be translated into one 
language but it will be understood that slides can be translated into multiple languages. 

[0061] Recorded information 502 is matched to information in PPT file 504. As shown, 
slides 508 include information from PPT file 504. In one embodiment, JPEGs and text from 
slides in PPT file 504 are extracted. The extracted JPEGs and text are then matched to
recorded information 502. For example, information in recorded information 502 is included
in a slide 508 in PPT file 504. In one embodiment, an image of a slide extracted from the
recorded information matches a slide in a source document 108. The slide may be
determined using techniques in which information in recorded information 502 is compared
with information in PPT file 504 to find matching information. Also, a slide number
may be identified in an image in recorded information 502 and the slide number is used to
retrieve a slide in PPT file 504. Keystrokes or voice commands stored as part of the recorded
information may also be used to identify a slide in PPT file 504. 
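
The slide-number and keystroke cues mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions: the slide number is taken to be one-based and already recognized from the captured image, the keystrokes are assumed to be navigation keys recorded from the start of the presentation, and the key names and the function `slide_from_cues` are hypothetical.

```python
from typing import List, Optional

# Assumed key names for moving forward and backward through the slides.
NEXT_KEYS = {"Right", "Down", "PageDown", "Space"}
PREVIOUS_KEYS = {"Left", "Up", "PageUp"}


def slide_from_cues(slide_count: int,
                    slide_number: Optional[int] = None,
                    keystrokes: Optional[List[str]] = None) -> Optional[int]:
    """Return a zero-based slide index, or None if no cue identifies a slide."""
    if slide_count < 1:
        return None
    # Cue 1: a one-based slide number recognized in the captured image.
    if slide_number is not None and 1 <= slide_number <= slide_count:
        return slide_number - 1
    # Cue 2: replay the recorded navigation keystrokes from the first slide.
    if keystrokes:
        index = 0
        for key in keystrokes:
            if key in NEXT_KEYS:
                index = min(index + 1, slide_count - 1)
            elif key in PREVIOUS_KEYS:
                index = max(index - 1, 0)
        return index
    return None
```

For example, `slide_from_cues(5, slide_number=3)` selects the third slide directly, while `slide_from_cues(5, keystrokes=["Right", "Right", "Left"])` replays the keystrokes and lands on the second slide.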

[0062] Once the slide in PPT file 504 is identified, a translated slide 510 corresponding to 
the identified slide in PPT file 504 is determined. In one embodiment, translated slide 510 
may have been previously translated based upon slide 508 prior to processing of the recorded 
information. In another embodiment, slide 508 may be translated into slide 510 in real-time
upon receiving a signal to perform the translation. 
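
The two embodiments in this paragraph (retrieving a previously translated slide, or translating on demand) can be combined in a short sketch. The function `get_translated_slide` and its parameters are hypothetical, and `translate` stands in for whatever translation service is used; storing the on-demand result for later requests is a design choice of this example rather than something the description requires.

```python
from typing import Callable, Dict


def get_translated_slide(slide_index: int,
                         source_slides: Dict[int, str],
                         pretranslated: Dict[int, str],
                         translate: Callable[[str], str]) -> str:
    """Return the translated text for a slide, translating only when needed."""
    # Prefer a translation prepared before the recorded information is processed.
    if slide_index in pretranslated:
        return pretranslated[slide_index]
    # Otherwise translate on demand and keep the result for later requests.
    translated = translate(source_slides[slide_index])
    pretranslated[slide_index] = translated
    return translated
```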

[0063] Translated slide 510 is then communicated to device 412, which displays the
translated slide. Thus, instead of viewing a first slide in a first language, a translated slide of 
the first slide in a second language is automatically displayed when the first slide is outputted 
and displayed on display 104. If a matching slide in PPT file 504 is not found, the outputted
slide displayed on display 104 may be displayed on device 412. The above process may be 
repeated for any number of outputted slides from PPT file 504. 
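
The fallback behavior described above, repeated over a sequence of outputted slides, might look like the following sketch. Each frame is assumed to carry the outputted slide together with the index of the matching slide in the source document, or `None` when no match was found; the function name and data shapes are illustrative only.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def slides_for_device(frames: Iterable[Tuple[str, Optional[int]]],
                      translated_slides: Dict[int, str]) -> List[str]:
    """Choose, for each outputted slide, what the viewing device should display."""
    shown: List[str] = []
    for outputted_slide, matched_index in frames:
        if matched_index is not None and matched_index in translated_slides:
            # A matching slide was found: display its translation.
            shown.append(translated_slides[matched_index])
        else:
            # No match: fall back to the slide as it was outputted.
            shown.append(outputted_slide)
    return shown
```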

[0064] In another embodiment, a presentation may be given as above and translated slides 
are determined. Instead of outputting the slides in real-time, the translated slides may be 
saved to a file. As slides are displayed, the slides are translated or translated slides are
retrieved and saved to the file. A user can then view the translated version of the slides for
the presentation at a later time. 
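
One simple way to save translated slides as they are determined is to append one record per slide to a file and read the records back later. The sketch below assumes a JSON Lines file and the hypothetical helpers `save_translated_slides` and `load_translated_slides`; the description does not specify a file format.

```python
import json
from typing import Dict, Iterable, List, Tuple


def save_translated_slides(path: str, slides: Iterable[Tuple[int, str]]) -> int:
    """Append (slide index, translated text) records to the file; return the count."""
    count = 0
    with open(path, "a", encoding="utf-8") as handle:
        for slide_index, translated_text in slides:
            record = {"slide": slide_index, "text": translated_text}
            handle.write(json.dumps(record, ensure_ascii=False) + "\n")
            count += 1
    return count


def load_translated_slides(path: str) -> List[Dict]:
    """Read back every saved record in the order it was written."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

Opening the file in append mode lets slides be saved one at a time during the presentation, so the file is usable for later viewing even if the presentation ends early.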

[0065] In another example, information in an agenda may be used to determine actions to 
perform. When an agenda item is displayed and captured as recorded information, a 
notification that the agenda item is being discussed is sent.

[0066] While the present invention has been described using a particular combination of 
hardware and software implemented in the form of control logic, it should be recognized that 
other combinations of hardware and software are also within the scope of the present 
invention. The present invention may be implemented only in hardware, or only in software, 
or using combinations thereof.

[0067] The above description is illustrative but not restrictive. Many variations of the
invention will become apparent to those skilled in the art upon review of the disclosure. The 
scope of the invention should, therefore, be determined not with reference to the above 
description, but instead should be determined with reference to the pending claims along with 
their full scope or equivalents.